

5 personal care products that solved real problems in 2025

Popular Science

In a market saturated with wellness products that promise to fix your whole life but rarely deliver much of anything, this year's personal care winners stand out for actually solving real problems. The 2025 class represents genuine inclusivity and thoughtful design--from a breast pump that goes old school to level up its wearability, to world-class headphones that double as hearing aids and workout coaches. These products address overlooked challenges with smart engineering: making fragrance bottles easier to grip, transforming sleep routines for exhausted parents, and rethinking recovery gear so athletes can soothe strained muscles while on the move. Each winner proves that meaningful innovation happens when companies consider users' actual needs--and use that knowledge to make good products great.


The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data

Luca, Massimiliano, Beneduce, Ciro, Lepri, Bruno, Staiano, Jacopo

arXiv.org Artificial Intelligence

With the wide and cross-domain adoption of Large Language Models, it becomes crucial to assess to which extent the statistical correlations in training data, which underlie their impressive performance, hide subtle and potentially troubling biases. Gender bias in LLMs has been widely investigated from the perspectives of occupations, hobbies, and emotions typically associated with a specific gender. In this study, we introduce a novel perspective. We investigate whether LLMs can predict an individual's gender based solely on online shopping histories and whether these predictions are influenced by gender biases and stereotypes. Using a dataset of historical online purchases from users in the United States, we evaluate the ability of six LLMs to classify gender, and we then analyze their reasoning and product-gender co-occurrences. Results indicate that while models can infer gender with moderate accuracy, their decisions are often rooted in stereotypical associations between product categories and gender. Furthermore, explicit instructions to avoid bias reduce the certainty of model predictions, but do not eliminate stereotypical patterns. Our findings highlight the persistent nature of gender biases in LLMs and emphasize the need for robust bias-mitigation strategies.
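The product-gender co-occurrence analysis the abstract mentions can be sketched as a simple counting exercise. The categories and predictions below are invented toy data, not the study's dataset; the share function is a hypothetical association measure, not the authors' metric.

```python
from collections import Counter

# Hypothetical purchase histories paired with a model's predicted gender.
# Category names and predictions are illustrative stand-ins only.
predictions = [
    (["makeup", "skincare"], "female"),
    (["power tools", "video games"], "male"),
    (["skincare", "video games"], "female"),
    (["power tools", "makeup"], "male"),
]

# Count how often each product category co-occurs with each predicted gender.
cooccur = Counter()
totals = Counter()
for categories, gender in predictions:
    for cat in categories:
        cooccur[(cat, gender)] += 1
        totals[cat] += 1

# Share of a category's appearances attributed to one gender: values near
# 1.0 suggest a stereotypical association in the model's predictions.
def gender_share(category, gender):
    return cooccur[(category, gender)] / totals[category]

print(gender_share("power tools", "male"))  # 1.0 in this toy data
print(gender_share("makeup", "female"))     # 0.5 in this toy data
```

A lopsided share for a category that has no intrinsic gender link is the kind of stereotypical pattern the study reports persisting even under explicit debiasing instructions.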


Semantic-Metric Bayesian Risk Fields: Learning Robot Safety from Human Videos with a VLM Prior

Chen, Timothy, Dominguez-Kuhne, Marcus, Swann, Aiden, Liu, Xu, Schwager, Mac

arXiv.org Artificial Intelligence

Humans interpret safety not as a binary signal but as a continuous, context- and spatially-dependent notion of risk. While risk is subjective, humans form rational mental models that guide action selection in dynamic environments. This work proposes a framework for extracting implicit human risk models by introducing a novel, semantically-conditioned and spatially-varying parametrization of risk, supervised directly from safe human demonstration videos and VLM common sense. Notably, we define risk through a Bayesian formulation. The prior is furnished by a pretrained vision-language model. In order to encourage the risk estimate to be more human-aligned, a likelihood function modulates the prior to produce a relative metric of risk. Specifically, the likelihood is a learned ViT that maps pretrained features to pixel-aligned risk values. Our pipeline ingests RGB images and a query object string, producing pixel-dense risk images. These images can then be used as value predictors in robot planning tasks or be projected into 3D for use in conventional trajectory optimization to produce human-like motion. This learned mapping enables generalization to novel objects and contexts, and has the potential to scale to much larger training datasets. In particular, the Bayesian framework that is introduced enables fast adaptation of our model to additional observations or common sense rules. We demonstrate that our proposed framework produces contextual risk that aligns with human preferences. Additionally, we illustrate several downstream applications of the model: as a value learner for visuomotor planners or in conjunction with a classical trajectory optimization algorithm. Our results suggest that our framework is a significant step toward enabling autonomous systems to internalize human-like risk. Code and results can be found at https://riskbayesian.github.io/bayesian_risk/.
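The per-pixel Bayesian structure described above (VLM prior modulated by a learned likelihood) can be sketched with synthetic arrays. The random maps below are stand-ins for the actual VLM and ViT outputs, and the min-max rescaling is an illustrative choice for producing a relative risk image.

```python
import numpy as np

# Toy stand-ins for the two model outputs; both are per-pixel maps in [0, 1].
H, W = 4, 4
rng = np.random.default_rng(0)
prior = rng.uniform(0.0, 1.0, size=(H, W))       # VLM common-sense prior
likelihood = rng.uniform(0.0, 1.0, size=(H, W))  # learned human-alignment term

# Unnormalized posterior: elementwise product, as in Bayes' rule per pixel.
posterior = prior * likelihood

# Rescale to [0, 1] to obtain a relative, pixel-dense risk image.
risk = (posterior - posterior.min()) / (posterior.max() - posterior.min())

print(risk.shape)
```

Because the combination is a simple product, updating the prior with new observations or common-sense rules only requires recomputing the elementwise modulation, which is consistent with the fast adaptation the abstract highlights.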



The Tree Autoencoder Model, with Application to Hierarchical Data Visualization

Neural Information Processing Systems

We propose a new model for dimensionality reduction, the PCA tree, which works like a regular autoencoder, having explicit projection and reconstruction mappings. The projection is effected by a sparse oblique tree, having hard, hyperplane splits using few features and linear leaves. The reconstruction mapping is a set of local linear mappings. Thus, rather than producing a global map as in t-SNE and other methods, which often leads to distortions, it produces a hierarchical set of local PCAs. The use of a sparse oblique tree and of PCA in its leaves makes the overall model interpretable and very fast to project or reconstruct new points. Joint optimization of all the parameters in the tree is a nonconvex nondifferentiable problem. We propose an algorithm that is guaranteed to decrease the error monotonically and which scales to large datasets without any approximation. In experiments, we show PCA trees are able to identify a wealth of low-dimensional and cluster structure in image and document datasets.
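The projection/reconstruction structure described above can be sketched in miniature: one hard hyperplane split routes points to a leaf, and each leaf projects and reconstructs with its own local PCA. The real model learns the sparse oblique splits and leaf parameters jointly; this toy version fixes a single-feature split and fits the leaf PCAs independently.

```python
import numpy as np

# Two well-separated Gaussian clusters in 5 dimensions (synthetic data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (50, 5)), rng.normal(3, 1, (50, 5))])

# A sparse oblique split using a single feature: route on sign(x @ w + b).
w, b = np.zeros(5), 0.0
w[0] = 1.0

def fit_leaf_pca(Xleaf, k=2):
    mu = Xleaf.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xleaf - mu, full_matrices=False)
    return mu, Vt[:k]  # leaf mean and top-k principal directions

left, right = X[X @ w + b <= 0], X[X @ w + b > 0]
leaves = {0: fit_leaf_pca(left), 1: fit_leaf_pca(right)}

def project(x):
    mu, V = leaves[int(x @ w + b > 0)]
    return (x - mu) @ V.T  # low-dimensional code from the local PCA

def reconstruct(x):
    mu, V = leaves[int(x @ w + b > 0)]
    return mu + project(x) @ V

err = np.mean((X - np.array([reconstruct(x) for x in X])) ** 2)
print(err)  # small, since each leaf's PCA fits its own cluster
```

Projection and reconstruction each cost one hyperplane test plus one small matrix product, which is why the model is fast on new points, and the explicit split feature and leaf directions are what make it interpretable.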


LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings

Maier, Benjamin F., Aslak, Ulf, Fiaschi, Luca, Rismal, Nina, Fletcher, Kemble, Luhmann, Christian C., Dow, Robbie, Pappas, Kli, Wiecki, Thomas V.

arXiv.org Artificial Intelligence

Consumer research costs companies billions annually yet suffers from panel biases and limited scale. Large language models (LLMs) offer an alternative by simulating synthetic consumers, but produce unrealistic response distributions when asked directly for numerical ratings. We present semantic similarity rating (SSR), a method that elicits textual responses from LLMs and maps these to Likert distributions using embedding similarity to reference statements. Testing on an extensive dataset comprising 57 personal care product surveys conducted by a leading corporation in that market (9,300 human responses), SSR achieves 90% of human test-retest reliability while maintaining realistic response distributions (KS similarity > 0.85). Additionally, these synthetic respondents provide rich qualitative feedback explaining their ratings. This framework enables scalable consumer research simulations while preserving traditional survey metrics and interpretability.
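The core SSR step (mapping a free-text response to a Likert distribution via embedding similarity to reference statements) can be sketched as follows. The bag-of-words "embedding", reference statements, and softmax temperature are all illustrative assumptions; the paper's method would use a real sentence encoder and its own anchors.

```python
import numpy as np

# Toy bag-of-words embedding standing in for a real sentence encoder.
VOCAB = ["definitely", "would", "buy", "not", "never", "maybe", "love", "hate"]

def embed(text):
    words = text.lower().split()
    v = np.array([words.count(w) for w in VOCAB], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# Hypothetical reference statements anchoring each Likert point (1..5).
references = {
    1: "never buy hate",
    2: "would not buy",
    3: "maybe",
    4: "would buy",
    5: "definitely would buy love",
}

def ssr_distribution(response, temperature=0.1):
    # Cosine similarity to each anchor, then softmax into a distribution.
    sims = np.array([embed(response) @ embed(r) for r in references.values()])
    z = np.exp(sims / temperature)
    return z / z.sum()

dist = ssr_distribution("i would definitely buy this i love it")
print(dist.round(3))  # mass concentrates on the top rating
```

Because the output is a full distribution over Likert points rather than a single forced number, aggregate response shapes can stay realistic, which is the property the KS-similarity result above measures.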


Towards Reliable Evaluation of Large Language Models for Multilingual and Multimodal E-Commerce Applications

Xie, Shuyi, Liew, Ziqin, Zhang, Hailing, Zhang, Haibo, Hu, Ling, Zhou, Zhiqiang, Liu, Shuman, Zeng, Anxiang

arXiv.org Artificial Intelligence

Large Language Models (LLMs) excel on general-purpose NLP benchmarks, yet their capabilities in specialized domains remain underexplored. In e-commerce, existing evaluations--such as EcomInstruct, ChineseEcomQA, eCeLLM, and Shopping MMLU--suffer from limited task diversity (e.g., lacking product guidance and after-sales issues), limited task modalities (e.g., absence of multimodal data), synthetic or curated data, and a narrow focus on English and Chinese, leaving practitioners without reliable tools to assess models on complex, real-world shopping scenarios. We introduce EcomEval, a comprehensive multilingual and multimodal benchmark for evaluating LLMs in e-commerce. EcomEval covers six categories and 37 tasks (including 8 multimodal tasks), sourced primarily from authentic customer queries and transaction logs, reflecting the noisy and heterogeneous nature of real business interactions. To ensure both quality and scalability of reference answers, we adopt a semi-automatic pipeline in which large models draft candidate responses that are subsequently reviewed and modified by over 50 expert annotators with strong e-commerce and multilingual expertise. We define difficulty levels for each question and task category by averaging evaluation scores across models with different sizes and capabilities, enabling challenge-oriented and fine-grained assessment. EcomEval also spans seven languages--including five low-resource Southeast Asian languages--offering a multilingual perspective absent from prior work.


New recycling method turns Teflon into toothpaste fluoride

Popular Science

The approach converts the toxic nonstick coating into harmless ingredients. Breakthroughs, discoveries, and DIY tips sent every weekday. The common coating known as Teflon can keep food from sticking to cookware, but it's notoriously difficult to break down safely. Now, researchers in the United Kingdom have discovered a simple and cost-effective solution to the problem. The results aren't simply eco-friendly--they can also be upcycled into helpful toothpaste and drinking water additives.


MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation

Park, Jiin, Kim, Misuk

arXiv.org Artificial Intelligence

Recent attempts to integrate large language models (LLMs) into recommender systems have gained momentum, but most remain limited to simple text generation or static prompt-based inference, failing to capture the complexity of user preferences and real-world interactions. This study proposes the Multi-Aspect Driven LLM Agent MADRec, an autonomous LLM-based recommender that constructs user and item profiles by unsupervised extraction of multi-aspect information from reviews and performs direct recommendation, sequential recommendation, and explanation generation. MADRec generates structured profiles via aspect-category-based summarization and applies Re-Ranking to construct high-density inputs. When the ground-truth item is missing from the output, the Self-Feedback mechanism dynamically adjusts the inference criteria. Experiments across multiple domains show that MADRec outperforms traditional and LLM-based baselines in both precision and explainability, with human evaluation further confirming the persuasiveness of the generated explanations.